Join Queries with External Text Sources: Execution and Optimization
Abstract
Text is a pervasive information type, and many applications require querying over text sources in addition to structured data. This paper studies the problem of query processing in a system that loosely integrates an extensible database system and a text retrieval system. We focus on a class of conjunctive queries that include joins between text and structured data, in addition to selections over these two types of data. We adapt techniques from distributed query processing and introduce a novel class of join methods based on probing that is especially useful for joins with text systems, and we present a cost model for the various alternative query processing methods. Experimental results confirm the utility of these methods. The space of query plans is extended due to the additional techniques, and we describe an optimization algorithm for searching this extended space. The techniques we describe in this paper are applicable to other types of external data managers loosely integrated with a database system.
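To make the setting concrete, here is a minimal sketch of a join between structured rows and an external text system. It uses plain tuple substitution with a cache over distinct join values, a classical idea from distributed query processing. It is not the probing method the abstract introduces, whose details are not given here. `ToyTextSystem`, `cached_substitution_join`, and the sample data are all hypothetical names invented for illustration.

```python
from collections import defaultdict


class ToyTextSystem:
    """Hypothetical stand-in for an external text retrieval system."""

    def __init__(self, docs):
        self.calls = 0                     # number of queries sent to the text system
        self.index = defaultdict(set)      # word -> ids of documents containing it
        for doc_id, text in docs.items():
            for word in text.lower().split():
                self.index[word].add(doc_id)

    def search(self, *terms):
        """Return ids of documents that contain every term (conjunctive query)."""
        self.calls += 1
        sets = [self.index.get(t.lower(), set()) for t in terms]
        return set.intersection(*sets) if sets else set()


def cached_substitution_join(rows, join_col, text_system, text_term):
    """Pair each row with the documents containing text_term and the row's join value.

    One text query is issued per distinct join value (assumption: each call to
    the external system is expensive, so repeated values reuse the cached answer).
    """
    cache = {}
    result = []
    for row in rows:
        value = row[join_col]
        if value not in cache:
            cache[value] = text_system.search(text_term, value)
        for doc_id in sorted(cache[value]):
            result.append((row, doc_id))
    return result


if __name__ == "__main__":
    docs = {1: "smith optimization join", 2: "jones text retrieval", 3: "smith text join"}
    rows = [{"name": "Smith"}, {"name": "Jones"}, {"name": "Smith"}]
    text_system = ToyTextSystem(docs)
    for row, doc_id in cached_substitution_join(rows, "name", text_system, "join"):
        print(row["name"], doc_id)
    print("text queries issued:", text_system.calls)
```

The call counter stands in for the kind of quantity a cost model would estimate: the number of round trips to the external system tends to dominate, so strategies that issue fewer or more selective text queries are attractive.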
Similar resources
Sprinkling Selections over Join DAGs for Efficient Query Optimization
In optimizing queries, solutions based on AND/OR DAGs can generate all possible join orderings and select placements before searching for an optimal query execution strategy. But as the number of joins and selection conditions increases, the space and time complexity of generating an optimal query plan increases exponentially. In this paper, we use join graph for a relational database schema to either pr...
Execution in a Parallel Main-Memory Environment
In this paper, the performance and characteristics of the execution of various join-trees on a parallel DBMS are studied. The results of this study are a step toward the design of a query optimization strategy that is fit for parallel execution of complex queries. Among others, synchronization issues are identified to limit the performance gain from parallelism. A new hash-join alg...
A Model for Dataflow Query Execution in a Parallel Main-Memory Environment
This paper develops an analytical model for the behavior and the performance of multi-join queries. The model is simple, and it increases insight into the essentials of dataflow query execution. Multi-join queries are studied using this model. The results of this study confirm the results of a previous simulation study of multi-join queries. The gained understanding of dataflow query execution w...
Adaptive Optimization of Very Large Join Queries
The use of business intelligence tools and other means to generate queries has led to great variety in the size of join queries. While most queries are reasonably small, join queries with up to a hundred relations are not that exotic anymore, and the distribution of query sizes has an incredibly long tail. The largest real-world query that we are aware of accesses more than 4,000 relations. Thi...
Complex Query JOIN Optimization in Parallel Distributed Environment
The research work covers the query optimization concept in a parallel distributed environment. The queries considered are select-project-join (SPJ) queries over large databases. The main query operation considered in this research is the JOIN operation. For fast execution of a complex query, the JOIN operation time needs to be minimized. Different JOIN operation algorithms such as Network Byte O...
Publication date: 1995